Probabilistic Language Modelling

Abstract

Language models assign probabilities to strings of symbols. Their interpretation is reviewed and applied to text classification. A language recogniser is constructed from Bayes’ theorem and a simple bigram model. The recogniser gives near-perfect results on sentences of text and motivates a mixture language model. Hidden Markov models (HMMs) are reviewed as a method of capturing order over different length scales and are used to construct a mixture model. This allows text to be segmented into unknown languages and foreign words from known languages to be extracted from English text. Future directions are discussed.
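
The recogniser described in the abstract, Bayes' theorem combined with a bigram model, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the bigrams are taken over characters, add-one smoothing handles unseen bigrams, the prior over languages is uniform, and the two training strings are toy data invented for illustration. The function names (`train_bigram`, `log_likelihood`, `classify`) are hypothetical.

```python
import math
from collections import Counter

def train_bigram(text):
    """Count character bigrams and the characters that start them."""
    bigrams = Counter(zip(text, text[1:]))
    contexts = Counter(text[:-1])
    return bigrams, contexts

def log_likelihood(text, model, vocab_size):
    """log P(text | language) under an add-one-smoothed bigram model (assumed smoothing)."""
    bigrams, contexts = model
    total = 0.0
    for a, b in zip(text, text[1:]):
        total += math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
    return total

def classify(text, models, priors, vocab_size):
    """Bayes' theorem: pick the language maximising log P(language) + log P(text | language)."""
    scores = {
        lang: math.log(priors[lang]) + log_likelihood(text, model, vocab_size)
        for lang, model in models.items()
    }
    return max(scores, key=scores.get)

# Toy training data, invented for this sketch only.
corpora = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "french": "le renard brun saute par dessus le chien et le chat est assis sur le tapis",
}
vocab_size = len(set("".join(corpora.values())))
models = {lang: train_bigram(text) for lang, text in corpora.items()}
priors = {lang: 1 / len(corpora) for lang in corpora}  # uniform prior (assumption)

print(classify("the cat sat on the mat", models, priors, vocab_size))
print(classify("le chat est assis sur le tapis", models, priors, vocab_size))
```

Working in log space avoids numerical underflow when many small bigram probabilities are multiplied together. Real use would need far larger training corpora than the two sentences shown.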

Similar resources

Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers

This paper describes a unified architecture for integrating sub-lexical models with speech recognition, and a layered framework for context-dependent probabilistic hierarchical sublexical modelling. Previous work [1, 2, 3] has demonstrated the effectiveness of sub-lexical modelling using a core context-free grammar (CFG) augmented with context-dependent probabilistic models. Our major motivatio...


A Probabilistic Approach to Modelling Spatial Language with Its Application To Sensor Models

We examine why a probabilistic approach to modelling the various components of spatial language is the most practical for spatial algorithms in which they can be employed, and examine such models for prepositions such as 'between' and 'by'. We provide an example of such a probabilistic treatment by exploring a novel application of spatial models to the induction of the occupancy of an object in...


A hierarchical Dirichlet language model

We discuss a hierarchical probabilistic model whose predictions are similar to those of the popular language modelling procedure known as 'smoothing'. A number of interesting differences from smoothing emerge. The insights gained from a probabilistic view of this problem point towards new directions for language modelling. The ideas of this paper are also applicable to other problems such as th...


Modelling Probabilistic Inference Networks and Classification in Probabilistic Datalog

Probabilistic Graphical Models (PGMs) are a well-established approach for modelling uncertain knowledge and reasoning. Since we focus on inference, this paper explores Probabilistic Inference Networks (PINs), which are a special case of PGMs. PINs, commonly referred to as Bayesian Networks, are used in Information Retrieval to model tasks such as classification and ad-hoc retrieval. Intuitively, a ...


Experiences with Modelling Issues in Building Probabilistic Networks

Building a probabilistic network for a real-life application is a difficult and time-consuming task. Methodologies for building such a network, however, are still lacking. Also, literature on network-specific modelling issues is quite scarce. As we have developed a large probabilistic network for a complex medical domain, we have encountered and resolved numerous non-trivial modelling issues. S...


Acronym: QUASIMODO. Deliverable no.: D1.1. Title of Deliverable: Modelling Quantitative System Aspects

This deliverable describes the results of the QUASIMODO project on modelling quantitative system aspects. Keyword list: AADL, Arcade, architectural dependability evaluation, cost-bounded reachability, priced/weighted timed automata, probabilistic timed automata, probabilistic hybrid systems. ICT-FP7-STREP-214755 / QUASIMODO




Publication date: 2002